61 research outputs found

    Schema Vacuuming in Temporal Databases

    Temporal databases facilitate the support of historical information by providing functions for indicating the intervals during which a tuple was applicable (along one or more temporal dimensions). Because data are never deleted, only superseded, temporal databases are inherently append-only, resulting over time in a large historical sequence of database states. Data vacuuming in temporal databases allows this sequence to be shortened by strategically, and irrevocably, deleting obsolete data. Schema versioning allows users to maintain a history of database schemata without compromising the semantics of the data or the ability to view data through historical schemata. While the techniques required for data vacuuming in temporal databases have been relatively well covered, the associated area of vacuuming schemata has received less attention. This paper discusses this issue and proposes a mechanism that fits well with existing methods for data vacuuming and schema versioning.
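
    As a rough illustration of the data vacuuming idea mentioned in this abstract, the sketch below drops tuple versions that were superseded at or before a cutoff time. It is not the mechanism the paper proposes (the abstract does not describe one); the `Version` record, the `vacuum` function, and the sample data are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Version:
    """One version of a tuple in an append-only history (hypothetical layout)."""
    key: str
    value: str
    start: int            # time at which this version became current
    end: Optional[int]    # time at which it was superseded; None if still current


def vacuum(history: List[Version], cutoff: int) -> List[Version]:
    """Keep current versions and versions superseded after the cutoff.

    Versions superseded at or before the cutoff are dropped for good,
    which is what makes vacuuming irrevocable.
    """
    return [v for v in history if v.end is None or v.end > cutoff]


# Assumed sample history: one employee whose role changes over time.
history = [
    Version("emp1", "Clerk", 1990, 1995),
    Version("emp1", "Manager", 1995, 2001),
    Version("emp1", "Director", 2001, None),
]

kept = vacuum(history, cutoff=1996)
print([v.value for v in kept])
```

    A cutoff-based rule is only one possible policy; it is used here because it is the simplest one to state.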

    Experiences in building a tool for navigating association rule result sets

    Practical knowledge discovery is an iterative process. First, the experiences gained from one mining run are used to inform the parameter setting and the dataset and attribute selection for subsequent runs. Second, additional data, either incremental additions to existing datasets or the inclusion of additional attributes, means that the mining process is reinvoked, perhaps numerous times. Reducing the number of iterations, improving the accuracy of parameter setting and making the results of the mining run more clearly understandable can thus significantly speed up the discovery process. In this paper we discuss our experiences in this area and present a system that helps the user to navigate through association rule result sets in a way that makes it easier to find useful results from a large result set. We present several techniques that experience has shown us to be useful. The prototype system, IRSetNav, is discussed, which has capabilities in redundant rule reduction, subjective interestingness evaluation, item and itemset pruning, related information searching, text-based itemset and rule visualisation, hierarchy-based searching and tracking changes between data sets using a knowledge base. Techniques also discussed in the paper, but not yet accommodated into IRSetNav, include input schema selection, longitudinal ruleset analysis and graphical visualisation techniques.

    A signature-based indexing method for efficient content-based retrieval of relative temporal patterns

    Detecting anomalous longitudinal associations through higher order mining

    The detection of unusual or anomalous data is an important function in automated data analysis or data mining. However, the diversity of anomaly detection algorithms shows that it is often difficult to determine which algorithms might detect anomalies given any random dataset. In this paper we provide a partial solution to this problem by elevating the search for anomalous data in transaction-oriented datasets to an inspection of the rules that can be produced by higher order longitudinal/spatio-temporal association rule mining. In this way we are able to apply algorithms that may provide a view of anomalies that is arguably closer to that sought by information analysts.

    SemGrAM - Integrating semantic graphs into association rule mining

    To date, most association rule mining algorithms have assumed that the domains of items are either discrete or, in a limited number of cases, hierarchical, categorical or linear. This constrains the search for interesting rules to those that satisfy the specified quality metrics as independent values or as higher level concepts of those values. However, in many cases the determination of a single hierarchy is not practicable and, for many datasets, an item’s value may be taken from a domain that is more conveniently structured as a graph with weights indicating semantic (or conceptual) distance. Research in the development of algorithms that generate disjunctive association rules has allowed the production of rules such as Radios ∨ TVs -> Cables. In many cases there is little semantic relationship between the disjunctive terms, and arguably less readable rules such as Radios ∨ Tuesday -> Cables can result. This paper describes two association rule mining algorithms, SemGrAMG and SemGrAMP, that accommodate conceptual distance information contained in a semantic graph. The SemGrAM algorithms permit the discovery of rules that include an association between sets of cognate groups of item values. The paper discusses the algorithms, the design decisions made during their development and some experimental results.
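
    To make the disjunctive rule form in this abstract concrete, the sketch below computes the standard support and confidence of a rule such as Radios ∨ TVs -> Cables over a toy set of transactions. It is not an implementation of SemGrAMG or SemGrAMP, and it uses no semantic graph; the function name `disjunctive_rule_stats` and the basket data are assumptions made for illustration.

```python
from typing import List, Set, Tuple


def disjunctive_rule_stats(
    transactions: List[Set[str]],
    antecedent_any: Set[str],
    consequent: Set[str],
) -> Tuple[float, float]:
    """Support and confidence of the rule (a1 OR a2 OR ...) -> consequent.

    A transaction matches the antecedent if it contains at least one
    of the items in `antecedent_any`.
    """
    n = len(transactions)
    matches_antecedent = [t for t in transactions if t & antecedent_any]
    matches_both = [t for t in matches_antecedent if consequent <= t]
    support = len(matches_both) / n if n else 0.0
    confidence = (
        len(matches_both) / len(matches_antecedent) if matches_antecedent else 0.0
    )
    return support, confidence


# Assumed toy data: four shopping baskets.
baskets = [
    {"Radios", "Cables"},
    {"TVs", "Cables"},
    {"TVs"},
    {"Bread"},
]

support, confidence = disjunctive_rule_stats(baskets, {"Radios", "TVs"}, {"Cables"})
print(f"support={support:.2f} confidence={confidence:.2f}")
```

    Nothing in this calculation checks whether the disjoined items are related, which is how a rule like Radios ∨ Tuesday -> Cables can pass the same thresholds.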

    A survey of temporal knowledge discovery paradigms and methods

    With the increase in the size of data sets, data mining has recently become an important research topic and is receiving substantial interest from both academia and industry. At the same time, interest in temporal databases has been increasing and a growing number of both prototype and implemented systems are using an enhanced temporal understanding to explain aspects of behavior associated with the implicit time-varying nature of the universe. This paper investigates the confluence of these two areas, surveys the work to date, and explores the issues involved and the outstanding problems in temporal data mining.

    Establishing a lineage for medical knowledge discovery

    Medical science has a long history characterised by incidents of extraordinary insight that have resulted in paradigm shifts in the methodologies and approaches used and have moved the discipline forward. While knowledge discovery has much to offer medicine, it cannot be done in ignorance of either this history or the norms of modern medical investigation. This paper explores the lineage of medical knowledge acquisition and discusses the adverse perceptions that data mining techniques will have to surmount to gain acceptance.

    On the impact of Knowledge Discovery and Data Mining

    Knowledge Discovery and Data Mining are powerful automated data analysis tools and they are predicted to become the most frequently used analytical tools in the near future. The rapid dissemination of these technologies calls for an urgent examination of their social impact. This paper identifies social issues arising from Knowledge Discovery (KD) and Data Mining (DM). An overview of these technologies is presented, followed by a detailed discussion of each issue. The paper's intention is primarily to illustrate the cultural context of each issue and, secondly, to describe the impact of KD and DM in each case. Existing solutions specific to each issue are identified and examined for feasibility and effectiveness, and a solution that provides a suitably contextually sensitive means for gathering and analysing sensitive data is proposed and briefly outlined. The paper concludes with a discussion of topics for further consideration.